Inverting Cryptographic Hash Functions via Cube-and-Conquer
MD4 and MD5 are seminal cryptographic hash functions proposed in early 1990s.
MD4 consists of 48 steps and produces a 128-bit hash given a message of
arbitrary finite size. MD5 is a more secure 64-step extension of MD4. Both MD4
and MD5 are vulnerable to practical collision attacks, yet it is still not
realistic to invert them, i.e. to find a message given a hash. In 2007, the
39-step version of MD4 was inverted by reducing it to SAT and applying a CDCL
solver along with the so-called Dobbertin constraints. As for MD5, in 2012
its 28-step version was inverted via a CDCL solver for one specified hash
without adding any additional constraints. In this study, Cube-and-Conquer (a
combination of CDCL and lookahead) is applied to invert step-reduced versions
of MD4 and MD5. For this purpose, two algorithms are proposed. The first one
generates inversion problems for MD4 by gradually modifying Dobbertin's
constraints. The second algorithm tries the cubing phase of Cube-and-Conquer
with different cutoff thresholds to find the one that minimizes the estimated
runtime of the conquer phase. This algorithm operates in two modes: (i)
estimating the hardness of a given propositional Boolean formula; (ii)
incomplete SAT-solving of a given satisfiable propositional Boolean formula.
While the first algorithm is focused on inverting step-reduced MD4, the second
one is not area-specific and so is applicable to a variety of classes of hard
SAT instances. In this study, 40-, 41-, 42-, and 43-step MD4 are inverted for
the first time via the first algorithm and the estimating mode of the second
algorithm. 28-step MD5 is inverted for four hashes via the incomplete
SAT-solving mode of the second algorithm. For three of these hashes, this is
done for the first time.
Comment: 40 pages, 11 figures. A revised submission to JAI.
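The threshold search performed by the second algorithm can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: `cube_phase` is a hypothetical stand-in for a lookahead cuber, and the cost model is invented for illustration; the real algorithm operates on CNF encodings of step-reduced MD4/MD5.

```python
import random

def cube_phase(threshold):
    """Hypothetical stand-in for a lookahead cuber: smaller cutoff
    thresholds split the formula into more (but easier) cubes.
    Returns (number of cubes, typical per-cube solve time)."""
    n_cubes = 2 ** (20 - threshold // 100)
    per_cube_time = 0.001 * 1.5 ** (threshold // 100)
    return n_cubes, per_cube_time

def estimate_conquer_runtime(threshold, sample_size=10):
    """Estimate total conquer-phase time by solving a random sample of
    cubes and extrapolating: (mean sampled time) * (number of cubes)."""
    n_cubes, base = cube_phase(threshold)
    sample = [base * random.uniform(0.5, 1.5)
              for _ in range(min(sample_size, n_cubes))]
    return (sum(sample) / len(sample)) * n_cubes

def best_threshold(thresholds):
    """Pick the cutoff threshold whose estimated conquer time is minimal."""
    return min(thresholds, key=estimate_conquer_runtime)
```

Under this toy cost model, larger thresholds produce fewer cubes, and the drop in cube count outweighs the higher per-cube cost; on real instances the trade-off goes either way, which is why the algorithm estimates it empirically per formula.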
Parenclitic and Synolytic Networks Revisited
© 2021 Nazarenko, Whitwell, Blyuss and Zaikin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). https://creativecommons.org/licenses/by/4.0/

Parenclitic networks provide a powerful and relatively new way to coerce multidimensional data into a graph form, enabling the application of graph theory to evaluate features. Different algorithms have been published for constructing parenclitic networks, which raises the question: which algorithm should be chosen? Initially, it was suggested to calculate the weight of an edge between two nodes of the network as the deviation from a linear regression fitted to the dependence of one of these features on the other. This method works well, but not when the features do not have a linear relationship. To overcome this, it was suggested to calculate edge weights as the distance from the region of most probable values obtained by kernel density estimation. In these two approaches only one class (typically controls or the healthy population) is used to construct the model. To take a second class into account, we introduced synolytic networks, which use the boundary between two classes on the feature-feature plane to estimate the weight of the edge between those features. Common to all these approaches is that topological indices can be used to evaluate the structure represented by the graphs.

To compare these network approaches with more traditional machine-learning algorithms, we performed a substantial analysis using both synthetic data with a priori known structure and publicly available datasets used for benchmarking ML algorithms. This comparison showed that the main advantage of parenclitic and synolytic networks is their resistance to over-fitting (which occurs when the number of features is greater than the number of subjects) compared to other ML approaches.
The second advantage is the capability to visualise data in a structured form, even when this structure is not available a priori, allowing for visual inspection and the application of well-established graph theory to its interpretation, eliminating the "black-box" nature of other ML approaches.
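The original linear-regression variant of the parenclitic construction can be illustrated concretely. The sketch below is an assumption-laden toy, not the authors' code: for every pair of features, a line is fitted to the control class, and a subject's absolute residual from that line becomes the weight of the edge between the two features in that subject's network.

```python
def linreg(xs, ys):
    """Ordinary least-squares fit y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    b = cov / var
    return my - b * mx, b

def parenclitic_weights(controls, subject):
    """controls: list of feature vectors from the control class.
    subject: one feature vector. Returns {(i, j): weight}, where the
    weight is the subject's absolute residual from the control-class
    regression of feature j on feature i."""
    m = len(subject)
    weights = {}
    for i in range(m):
        for j in range(i + 1, m):
            xs = [c[i] for c in controls]
            ys = [c[j] for c in controls]
            a, b = linreg(xs, ys)
            weights[(i, j)] = abs(subject[j] - (a + b * subject[i]))
    return weights
```

The resulting weighted graph (one per subject) can then be summarised by topological indices; the KDE and synolytic variants differ only in how the per-edge weight is computed.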
Integrated Information in the Spiking-Bursting Stochastic Model
This study presents a comprehensive analytic description in terms of the
empirical "whole minus sum" version of Integrated Information in comparison to
the "decoder based" version for the "spiking-bursting" discrete-time,
discrete-state stochastic model, which was recently introduced to describe a
specific type of dynamics in a neuron-astrocyte network. The "whole minus sum"
information may change sign, and an interpretation of this transition in terms
of "net synergy" is available in the literature. This motivates our particular
interest in the sign of the "whole minus sum" information in our analytical
treatment. The behaviors of the "whole minus sum" and "decoder based"
information measures are found to be very similar, showing their
mutual asymptotic convergence as time-uncorrelated activity is increased, with
the sign transition of the "whole minus sum" information associated with a rapid
growth in the "decoder based" information. The study aims to create a
theoretical base for using the spiking-bursting model as a well understood
reference point for applying Integrated Information concepts to systems
exhibiting similar bursting behavior (in particular, to neuron-astrocyte
networks). The model can also be of interest as a new discrete-state test bench
for different formulations of Integrated Information.
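The "whole minus sum" quantity can be written down directly for a small discrete-state system as Φ = I(whole past; whole future) − Σ_k I(unit-k past; unit-k future). The sketch below is a generic illustration of this definition, not the paper's spiking-bursting model: a two-unit system whose future is the XOR of its past bits gives Φ = 1 bit (synergy), while two perfectly correlated copy units give Φ = −1 bit, showing the sign change discussed above.

```python
from math import log2

def mutual_info(joint):
    """I(X; Y) in bits from a joint pmf {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def whole_minus_sum(joint, n_units):
    """'Whole minus sum' integrated information for a discrete system.
    joint maps (past_state_tuple, future_state_tuple) -> probability."""
    whole = mutual_info(joint)
    parts = 0.0
    for k in range(n_units):
        # Marginal past/future distribution of unit k alone.
        marg = {}
        for (past, fut), p in joint.items():
            key = (past[k], fut[k])
            marg[key] = marg.get(key, 0.0) + p
        parts += mutual_info(marg)
    return whole - parts
```

A negative value means the parts individually carry more predictive information than the whole, i.e. the system is redundancy-dominated rather than synergistic.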
Multi-input distributed classifiers for synthetic genetic circuits
For the practical construction of complex synthetic genetic networks able to
perform elaborate functions, it is important to have a pool of relatively simple
"bio-bricks" with different functionalities that can be combined together. To
complement the engineering of very different existing synthetic genetic devices
such as switches, oscillators, and logic gates, we propose and develop here a
design for a synthetic multi-input distributed classifier with learning
ability. The proposed classifier is able to separate multi-input data that
are inseparable for single-input classifiers. Additionally, the data classes
may occupy regions of arbitrary shape in the space of inputs. We study
two approaches to classification, hard and soft, and confirm the proposed
genetic network schemes with analytical and numerical results.
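The hard/soft distinction can be sketched with a toy model under invented assumptions: each "classifier cell" is modeled here as a Hill-type activation of a weighted input sum (the Hill form and all parameter names are illustrative, not the paper's model), and the distributed classifier either averages the graded responses of the cell population (soft) or takes a majority vote of thresholded responses (hard).

```python
def cell_response(inputs, weights, theta, n=2):
    """One 'classifier cell': Hill-type response that rises toward 1
    as the weighted input sum exceeds the threshold theta."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return s ** n / (s ** n + theta ** n) if s > 0 else 0.0

def distributed_classifier(inputs, cells, soft=True):
    """cells: list of (weights, theta) pairs, one per cell.
    Soft mode averages graded responses; hard mode returns the
    majority vote of responses thresholded at 0.5."""
    responses = [cell_response(inputs, w, th) for w, th in cells]
    if soft:
        return sum(responses) / len(responses)
    return sum(r > 0.5 for r in responses) / len(responses) > 0.5
```

Because different cells carry different thresholds, the population as a whole can carve out decision regions that no single-input, single-threshold device could represent, which is the intuition behind the distributed design.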